
    Neural network for ordinal classification of imbalanced data by minimizing a Bayesian cost

    Ordinal classification of imbalanced data is a challenging problem that appears in many real-world applications. The challenge is to consider simultaneously the order of the classes and the class imbalance, which can notably improve the performance metrics. The Bayesian formulation makes it possible to deal with these two characteristics jointly: it takes into account the prior probability of each class and the decision costs, which can be used to encode the imbalance and the ordinal information, respectively. We propose to use the Bayesian formulation to train neural networks, which have shown excellent results in many classification tasks. A loss function is proposed to train networks with a single neuron in the output layer and a threshold-based decision rule. The loss is an estimate of the Bayesian classification cost, based on a Parzen window estimator and fitted for a thresholded decision. Experiments with several real datasets show that the proposed method provides competitive results in different scenarios, owing to its high flexibility in specifying the relative importance of the errors made on patterns of different classes, taking the class order into account and independently of the probability of each class. This work was partially supported by the Spanish Ministry of Science and Innovation through the Thematic Network "MAPAS" (TIN2017-90567-REDT) and by the BBVA Foundation through the "2-BARBAS" research grant. Funding for APC: Universidad Carlos III de Madrid (Read & Publish Agreement CRUE-CSIC 2023).
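    A minimal sketch of the single-output, threshold-based decision rule described in this abstract; the number of classes, the threshold values, and the distance-based cost matrix are illustrative assumptions, not the authors' exact configuration:

        import numpy as np

        # K - 1 ordered thresholds partition the scalar network output into K = 4 classes (assumed values).
        thresholds = np.array([-1.0, 0.0, 1.0])

        def ordinal_decision(z):
            """Map the scalar network output z to an ordinal class index 0..K-1."""
            return int(np.searchsorted(thresholds, z))

        # Example decision costs: cost[d, t] penalizes deciding class d when the true class is t,
        # proportionally to the ordinal distance and scaled by the inverse prior of the true class
        # to reflect the imbalance (an assumed policy for illustration).
        priors = np.array([0.60, 0.25, 0.10, 0.05])
        K = len(priors)
        cost = np.abs(np.arange(K)[:, None] - np.arange(K)[None, :]) / priors[None, :]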

    On building ensembles of stacked denoising auto-encoding classifiers and their further improvement

    Aggregating diverse learners and training deep architectures are the two principal avenues towards increasing the expressive capabilities of neural networks, so their combination merits attention. In this contribution, we study how to apply some conventional diversity methods (bagging and label switching) to a general deep machine, the stacked denoising auto-encoding classifier, in order to solve a number of appropriately selected image recognition problems. The main conclusion of our work is that binarizing multi-class problems is the key to obtaining benefit from those diversity methods. Additionally, we verify that adding other kinds of performance improvement procedures, such as pre-emphasizing training samples and elastic distortion mechanisms, further increases the quality of the results. In particular, an appropriate combination of all the above methods leads us to a new absolute record in classifying MNIST handwritten digits. These facts reveal that there are clear opportunities for designing more powerful classifiers by combining different improvement techniques. (C) 2017 Elsevier B.V. All rights reserved. This work has been partly supported by research grants CASI-CAM-CM (S2013/ICE-2845, Madrid Community) and Macro-ADOBE (TEC2015-67719, MINECO-FEDER EU), as well as by the research network DAMA (TIN2015-70308-REDT, MINECO).
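    A minimal sketch of the label-switching diversification mentioned above, applied independently to the training labels of each ensemble member; the switching rate and the uniform choice of replacement class are illustrative assumptions:

        import numpy as np

        def switch_labels(y, rate=0.1, n_classes=10, rng=None):
            """Return a copy of y in which a fraction `rate` of the labels is replaced
            by a different class chosen uniformly at random (label switching)."""
            rng = np.random.default_rng() if rng is None else rng
            y_switched = y.copy()
            idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
            shift = rng.integers(1, n_classes, size=len(idx))  # never maps a label onto itself
            y_switched[idx] = (y_switched[idx] + shift) % n_classes
            return y_switched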

    Pre-emphasizing Binarized Ensembles to Improve Classification Performance

    14th International Work-Conference on Artificial Neural Networks, IWANN 2017. Machine ensembles are learning architectures that offer high expressive capacity and, consequently, remarkable performance, due to their large number of trainable parameters. In this paper, we explore and discuss whether binarization techniques are effective in improving standard diversification methods and whether a simple additional trick, consisting of weighting the training examples, yields better results. Experimental results on three selected classification problems show that binarization allows standard direct diversification methods (bagging, in particular) to achieve better results, with even more significant performance improvements when the training samples are pre-emphasized. Some research avenues that this finding opens are mentioned in the conclusions. This work has been partly supported by research grants CASI-CAM-CM (S2013/ICE-2845, DGUI-CM and FEDER) and Macro-ADOBE (TEC2015-67719-P, MINECO).
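    A minimal sketch of the binarization step discussed above: the multi-class problem is split into one-vs-rest binary subproblems, each trained with its own bagged ensemble and optional per-sample emphasis weights. The scikit-learn estimators used here are stand-ins for illustration; the paper's base learners are neural networks:

        import numpy as np
        from sklearn.ensemble import BaggingClassifier
        from sklearn.tree import DecisionTreeClassifier

        def fit_binarized_ensembles(X, y, n_classes, sample_weight=None):
            """Train one bagged binary ensemble per class (one-vs-rest binarization)."""
            machines = []
            for c in range(n_classes):
                clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25)
                clf.fit(X, (y == c).astype(int), sample_weight=sample_weight)
                machines.append(clf)
            return machines

        def predict_binarized(machines, X):
            """Pick the class whose binary ensemble assigns the highest positive-class score."""
            scores = np.column_stack([m.predict_proba(X)[:, 1] for m in machines])
            return scores.argmax(axis=1)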

    Boosting ensembles with controlled emphasis intensity

    Boosting ensembles have received much attention because of their high performance, but they are also sensitive to adverse conditions, such as noisy environments or the presence of outliers. One way to counteract this degradation is to modify the form of the emphasis weighting applied to train each new learner. In this paper, we propose a general form for that emphasis function which includes not only an error-dependent term and a term dependent on the proximity to the classification boundary, but also a constant value that controls how much emphasis is applied. Two convex combinations are used to mix these terms, which makes it possible to control their relative influence. Experimental results support the effectiveness of this general form of boosting emphasis. This work has been partly supported by research grants CASI-CAM-CM (S2013/ICE-2845, DGUI-CM) and Macro-ADOBE (TEC2015-67719-P, MINECO).
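    One possible reading of such an emphasis function, written as two nested convex combinations; the specific error and proximity terms used below are assumptions for illustration, not the paper's exact expressions:

        import numpy as np

        def emphasis_weights(errors, margins, alpha=0.5, beta=0.5):
            """Illustrative emphasis function: an outer convex combination (alpha) mixes
            a constant, uniform term with an inner convex combination (beta) of an
            error-dependent term and a proximity-to-boundary term.

            errors  : per-sample errors of the current ensemble output (nonnegative)
            margins : per-sample signed distances to the classification boundary
            alpha   : emphasis intensity (alpha = 0 gives uniform weights)
            beta    : balance between the error and proximity terms
            """
            error_term = errors / errors.sum()
            proximity_term = np.exp(-margins ** 2)
            proximity_term = proximity_term / proximity_term.sum()
            w = alpha * (beta * error_term + (1 - beta) * proximity_term) + (1 - alpha) / len(errors)
            return w / w.sum()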

    A Bayes risk minimization machine for example-dependent cost classification

    A new method for example-dependent cost (EDC) classification is proposed. The method constitutes an extension of a recently introduced training algorithm for neural networks. The surrogate cost function is an estimate of the Bayes risk, in which the conditional probabilities for each class are defined in terms of a 1-D Parzen window estimator of the output of (discriminative) neural networks. This probability density is modeled so as to allow an easy minimization of a sampled version of the Bayes risk. The conditional probabilities included in the definition of the risk are not explicitly estimated; instead, the risk is minimized directly by a gradient-descent algorithm. The proposed method has been evaluated using linear classifiers and neural networks, with both shallow (a single hidden layer) and deep (multiple hidden layers) architectures. The experimental results show the potential and flexibility of the proposed method, which can handle EDC classification under the imbalanced data situations that commonly appear in this kind of problem. This work has been partly supported by grants CASI-CAM-CM (S2013/ICE-2845, Madrid C/FEDER, EUSF) and Macro-ADOBE (TEC2015-67719-P, MINECO/FEDER, UE).
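    A minimal sketch of a Parzen-smoothed, sampled Bayes risk of the kind described above, for a binary problem with example-dependent costs; the Gaussian window, the threshold at zero, and the cost encoding are assumptions for illustration:

        import numpy as np
        from scipy.stats import norm

        def parzen_bayes_risk(z, y, cost_fn, cost_fp, threshold=0.0, h=0.3):
            """Smooth surrogate of the sampled Bayes risk.

            z        : scalar network outputs, one per example
            y        : labels in {0, 1}
            cost_fn  : per-example cost of deciding 0 when the true class is 1
            cost_fp  : per-example cost of deciding 1 when the true class is 0
            h        : Parzen window width smoothing the hard threshold decision
            """
            p_decide_0 = norm.cdf((threshold - z) / h)   # hard step replaced by a Gaussian CDF
            p_decide_1 = 1.0 - p_decide_0
            risk = np.where(y == 1, cost_fn * p_decide_0, cost_fp * p_decide_1)
            return risk.mean()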

    Optimum Bayesian thresholds for rebalanced classification problems using class-switching ensembles

    Asymmetric label switching is an effective and principled method for creating a diverse ensemble of learners for imbalanced classification problems. This technique can be combined with other rebalancing mechanisms, such as those based on cost policies or class proportion modifications. In this study, working within the Bayesian decision theory framework, we specify the optimal decision thresholds for the combination of these mechanisms. In addition, we propose using a gating network to aggregate the learners' contributions as an additional mechanism to improve the overall performance of the system. We thank the anonymous reviewers for their valuable suggestions and comments. This work is partially funded by Project PID2021-125652OB-I00 from the Ministerio de Ciencia e Innovación of Spain. Funding for APC: Universidad Carlos III de Madrid (Read & Publish Agreement CRUE-CSIC 2022). In memoriam: Prof. Aníbal R. Figueiras-Vidal (1950-2022).
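    For reference, the classical Bayes-optimal threshold on the positive-class posterior of a binary cost-sensitive problem is sketched below; it is the standard starting point, not the paper's full derivation, which also accounts for the switching and rebalancing mechanisms:

        def bayes_threshold(cost_fp, cost_fn):
            """Decide the positive class whenever P(positive | x) exceeds this value.
            cost_fp: cost of a false positive; cost_fn: cost of a false negative."""
            return cost_fp / (cost_fp + cost_fn)

        # Example: if false negatives are five times as costly as false positives,
        # the positive class is chosen whenever P(positive | x) > 1/6.
        print(bayes_threshold(1.0, 5.0))   # 0.1666...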

    A new boosting design of Support Vector Machine classifiers

    Boosting algorithms pay attention to the particular structure of the training data when learning, by iteratively emphasizing the importance of the training samples according to how difficult they are to classify correctly. If common kernel Support Vector Machines (SVMs) are used as base learners to construct a Real AdaBoost ensemble, the resulting ensemble can easily be compacted into a monolithic architecture by simply combining the weights that correspond to the same kernels when they appear in different learners, so this potential advantage is obtained without increasing the computational effort at operation time. In this way, the performance advantage that boosting provides can be obtained for monolithic SVMs, i.e., without paying a classification-time computational cost for requiring many learners. However, SVMs are both stable and strong, and using them for boosting requires destabilizing and weakening them; previous attempts in this direction have shown only moderate success. In this paper, we propose combining a new, appropriately designed subsampling process with an SVM algorithm that permits sparsity control, in order to overcome the difficulties of boosting SVMs and obtain improved designs. Experimental results support the effectiveness of the approach, not only in performance but also in the compactness of the resulting classifiers, and show that both design ideas must be combined to reach these advantageous designs. This work was supported in part by the Spanish MICINN under Grants TEC2011-22480 and TIN2011-24533.
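    A minimal sketch of the compaction idea described above: dual coefficients attached to identical support vectors are merged across the boosted learners, weighted by each learner's ensemble weight. The dictionary-based learner representation is an illustrative assumption:

        import numpy as np
        from collections import defaultdict

        def compact_kernel_ensemble(learners):
            """Merge a boosted ensemble of kernel machines that share the same kernel
            into a single monolithic kernel expansion.

            Each learner is assumed to be a dict with keys 'alpha' (ensemble weight),
            'sv' (support vectors), 'coef' (dual coefficients), and 'bias'.
            """
            merged = defaultdict(float)
            bias = 0.0
            for learner in learners:
                for sv, c in zip(map(tuple, learner['sv']), learner['coef']):
                    merged[sv] += learner['alpha'] * c
                bias += learner['alpha'] * learner['bias']
            support_vectors = np.array(list(merged.keys()))
            coefficients = np.array(list(merged.values()))
            return support_vectors, coefficients, bias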

    Plant identification via adaptive combination of transversal filters

    In least mean-square (LMS) algorithm applications, it is important to improve the trade-off between convergence speed and residual error that is imposed by the choice of a fixed step size. In this paper, we propose a mixture approach that adaptively combines two independent LMS filters, one with a large and one with a small step size, to obtain fast convergence together with low misadjustment during stationary periods. Plant identification simulation examples show the effectiveness of our method when compared to previous variable step-size approaches. This combination approach can be straightforwardly extended to other kinds of filters, as illustrated with a convex combination of recursive least-squares (RLS) filters.
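    A minimal sketch of the adaptive convex combination described above: two LMS filters with different step sizes run in parallel, and a mixing parameter lambda = sigmoid(a) is adapted by gradient descent on the combined squared error. The filter length, step sizes, and the clipping of a are assumptions for illustration:

        import numpy as np

        def combined_lms(x, d, L=8, mu_fast=0.05, mu_slow=0.005, mu_a=10.0):
            """Convex combination of a fast (large step size) and a slow (small step size)
            LMS filter of length L, identifying a plant from input x and desired output d."""
            w_fast = np.zeros(L)
            w_slow = np.zeros(L)
            a = 0.0
            y = np.zeros(len(d))
            for n in range(L, len(d)):
                u = x[n - L:n][::-1]                    # regressor, most recent sample first
                y_f, y_s = w_fast @ u, w_slow @ u
                lam = 1.0 / (1.0 + np.exp(-a))
                y[n] = lam * y_f + (1.0 - lam) * y_s
                e = d[n] - y[n]
                w_fast += mu_fast * (d[n] - y_f) * u    # independent LMS update, fast filter
                w_slow += mu_slow * (d[n] - y_s) * u    # independent LMS update, slow filter
                a += mu_a * e * (y_f - y_s) * lam * (1.0 - lam)   # gradient step on the mixer
                a = np.clip(a, -4.0, 4.0)               # keep the sigmoid out of saturation
            return y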